AITopics | noise schedule

Continuous Diffusion Scales Competitively with Discrete Diffusion for Language

Yang, Zhihan, Guo, Wei, Zhang, Shuibai, Sahoo, Subham Sekhar, Chen, Yongxin, Vahdat, Arash, Mardani, Morteza, Thickstun, John

arXiv.org Machine Learning | May-19-2026

While diffusion has drawn considerable recent attention from the language modeling community, continuous diffusion has appeared less scalable than discrete approaches. To challenge this belief we revisit Plaid, a likelihood-based continuous diffusion language model (DLM), and construct RePlaid by aligning the architecture of Plaid with modern discrete DLMs. In this unified setting, we establish the first scaling law for continuous DLMs that rivals discrete DLMs: RePlaid exhibits a compute gap of only $20\times$ compared to autoregressive models, outperforms Duo while using fewer parameters, and outperforms MDLM in the over-trained regime. We benchmark RePlaid against recent continuous DLMs: on OpenWebText, RePlaid achieves a new state-of-the-art PPL bound of $22.1$ among continuous DLMs and superior generation quality. These results suggest that continuous diffusion, when trained via likelihood, is a highly competitive and scalable alternative to discrete DLMs. Moreover, we offer theoretical insights to understand the advantage of likelihood-based training. We show that optimizing the noise schedule to minimize the ELBO's variance naturally yields linear cross-entropy (information loss) over time. This evenly distributes denoising difficulty without any case-specific time reparameterization. In addition, we find that optimizing embeddings via likelihood creates structured geometries and drives the most significant likelihood gain.

large language model, machine learning, replaid, (15 more...)

2605.1853

Country: North America > United States > Minnesota (0.28)

Genre: Research Report > New Finding (0.34)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Government > Regional Government > North America Government > United States Government (0.93)
Health & Medicine (0.67)
Education > Educational Setting > K-12 Education (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.66)

Understanding Diffusion Objectives as the ELBO with Simple Data Augmentation

Neural Information Processing Systems | Apr-29-2026, 20:02:26 GMT

To achieve the highest perceptual quality, state-of-the-art diffusion models are optimized with objectives that typically look very different from the maximum likelihood and the Evidence Lower Bound (ELBO) objectives. In this work, we reveal that diffusion model objectives are actually closely related to the ELBO. Specifically, we show that all commonly used diffusion model objectives equate to a weighted integral of ELBOs over different noise levels, where the weighting depends on the specific objective used. Under the condition of monotonic weighting, the connection is even closer: the diffusion objective then equals the ELBO, combined with simple data augmentation, namely Gaussian noise perturbation. We show that this condition holds for a number of state-of-the-art diffusion models. In experiments, we explore new monotonic weightings and demonstrate their effectiveness, achieving state-of-the-art FID scores on the high-resolution ImageNet benchmark.
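As a hedged illustration of the "weighted integral" statement above: the abstract itself gives no formulas, so the notation here is an assumption. With $\lambda_t$ the log signal-to-noise ratio at time $t$, $w$ the weighting function, $\mathbf{z}_t$ the noised data, and $\hat{\boldsymbol{\epsilon}}_\theta$ a noise-prediction network, a weighted diffusion loss of this kind is commonly written as:

```latex
\mathcal{L}_w(\mathbf{x})
  = \frac{1}{2}\,
    \mathbb{E}_{t \sim \mathcal{U}(0,1),\ \boldsymbol{\epsilon} \sim \mathcal{N}(0, \mathbf{I})}
    \left[
      w(\lambda_t) \cdot \left(-\frac{d\lambda_t}{dt}\right) \cdot
      \left\| \hat{\boldsymbol{\epsilon}}_\theta(\mathbf{z}_t; \lambda_t) - \boldsymbol{\epsilon} \right\|_2^2
    \right]
```

In this form, the constant weighting $w(\lambda_t) = 1$ would recover the negative ELBO up to an additive constant, and the other objectives would differ only in the choice of $w$.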

artificial intelligence, machine learning, noise schedule, (16 more...)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

260a14acce2a89dad36adc8eefe7c59e-Paper-Conference.pdf

Neural Information Processing Systems | Apr-25-2026, 03:46:33 GMT

artificial intelligence, dpm-solver, machine learning, (12 more...)

Country: Asia > China (0.15)

Genre: Research Report (0.46)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)

Appendices: Score-based Source Separation with Applications to Digital Communication Signals

Neural Information Processing Systems | Apr-24-2026, 23:09:51 GMT

During source separation, this presumably results in a noisier estimate of the SOI than for mixtures with no additional background noise. We estimated the SNR to be 16.9 dB by averaging across multiple samples. The dotted black curve on the left of Figure G.2 is a presumed lower bound on the BER, obtained by accounting for the magnitude of the background noise and modeling it as additive white Gaussian noise.

artificial intelligence, constellation, machine learning, (18 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Communications > Networks (0.93)

106b2434b8d496c6aed9235d478678af-Paper-Conference.pdf

Neural Information Processing Systems | Apr-24-2026, 23:09:49 GMT

artificial intelligence, diffusion model, machine learning, (20 more...)

Country:

Europe (1.00)
North America > United States (0.93)

Genre: Research Report > New Finding (0.46)

Industry: Government > Regional Government (0.67)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
(2 more...)

Noise Schedule

Neural Information Processing Systems | Apr-24-2026, 22:45:08 GMT

Because a diffusion model shares parameters for all diffusion steps, the noise schedule (parametrized by $\bar{\alpha}_{1:T}$) is an important hyperparameter that determines how much weight we assign to each denoising problem. We find that standard noise schedules for continuous diffusions are not robust for text data. We hypothesize that the discrete nature of text and the rounding step make the model insensitive to noise near $t = 0$. Concretely, adding a small amount of Gaussian noise to a word embedding is unlikely to change its nearest neighbor in the embedding space, making denoising an easy task near $t = 0$. To address this, we introduce a new sqrt noise schedule that is better suited for text, shown in Figure 5 and defined by $\bar{\alpha}_t = 1 - \sqrt{t/T + s}$, where $s$ is a small constant that corresponds to the starting noise level. Compared to standard linear and cosine schedules, our sqrt schedule starts with a higher noise level and increases noise rapidly for the first 50 steps. It then slows down the injection of noise to avoid spending too many steps on the high-noise problems, which may be too difficult to solve well.

The hyperparameters that are specific to Diffusion-LM include the number of diffusion steps, the architecture of the Diffusion-LM, the embedding dimension, and the noise schedule. We set the number of diffusion steps to 2000, the architecture to BERT-base [7], and the sequence length to 64. For the embedding dimension, we select from $d \in \{16, 64, 128, 256\}$, choosing $d = 16$ for the E2E dataset and $d = 128$ for ROCStories. For the noise schedule, we design the sqrt schedule (Appendix A), which is more robust to different parametrizations and embedding dimensions, as shown in Appendix M. However, once we picked the $x_0$-parametrization (§4.2), the advantage of the sqrt schedule is not salient.

We train Diffusion-LMs using the AdamW optimizer with a linearly decaying learning rate starting at 1e-4, dropout of 0.1, and a batch size of 64; the total number of training iterations is 200K for the E2E dataset and 800K for the ROCStories dataset. Our Diffusion-LMs are trained on a single GPU: NVIDIA RTX A5000, NVIDIA GeForce RTX 3090, or NVIDIA A100.
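To make the comparison between the sqrt, linear, and cosine schedules concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes the sqrt schedule has the form $\bar{\alpha}_t = 1 - \sqrt{t/T + s}$. The default `s=1e-4`, the clamp at zero, the linear schedule's beta endpoints and rescaling, and the cosine offset are all illustrative assumptions rather than values stated in the excerpt. The function names are hypothetical.

```python
import math


def sqrt_alpha_bar(t, T, s=1e-4):
    """Sqrt schedule: alpha_bar_t = 1 - sqrt(t/T + s).

    s=1e-4 is an assumed illustrative value. The clamp at 0 is also an
    assumption, added because 1 - sqrt(1 + s) is slightly negative at t = T.
    """
    return max(0.0, 1.0 - math.sqrt(t / T + s))


def linear_alpha_bar(t, T, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_i) over the first t steps.

    Betas are linearly spaced; the endpoints are assumed, and rescaled by
    1000/T so that they correspond to a 1000-step schedule.
    """
    scale = 1000.0 / T
    lo, hi = scale * beta_start, scale * beta_end
    prod = 1.0
    for i in range(t):
        prod *= 1.0 - (lo + (hi - lo) * i / (T - 1))
    return prod


def cosine_alpha_bar(t, T, offset=0.008):
    """Squared-cosine schedule, normalized so that alpha_bar_0 = 1."""
    def f(u):
        return math.cos((u / T + offset) / (1.0 + offset) * math.pi / 2.0) ** 2
    return f(t) / f(0)


def noise_std(alpha_bar):
    """Standard deviation of the injected Gaussian noise, sqrt(1 - alpha_bar)."""
    return math.sqrt(1.0 - alpha_bar)


if __name__ == "__main__":
    T = 2000  # number of diffusion steps reported in the text
    print("t, sqrt, linear, cosine (noise std)")
    for t in (0, 50, 500, 1000, 2000):
        print(
            t,
            round(noise_std(sqrt_alpha_bar(t, T)), 3),
            round(noise_std(linear_alpha_bar(t, T)), 3),
            round(noise_std(cosine_alpha_bar(t, T)), 3),
        )
```

Under these assumptions the sqrt schedule already injects noise at step 0, where the other two start from zero noise. It is also noisier than both at step 50, which matches the qualitative description in the text.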

artificial intelligence, machine learning, natural language, (20 more...)

Industry:

Consumer Products & Services > Restaurants (1.00)
Leisure & Entertainment > Sports (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

054f771d614df12fe8def8ecdbe4e8e1-Supplemental-Conference.pdf

Neural Information Processing Systems | Apr-24-2026, 07:38:31 GMT

arxiv preprint arxiv, machine learning, natural language, (17 more...)

Country: Europe (0.28)

Genre: Research Report (0.68)

Industry: Leisure & Entertainment (0.47)

Technology:

Information Technology > Artificial Intelligence > Speech (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Vision (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Diffusion4Audio (12)

Robin San Roman

Neural Information Processing Systems | Apr-24-2026, 07:38:27 GMT

arxiv preprint arxiv, machine learning, natural language, (16 more...)

Country: Europe (0.28)

Genre: Research Report (0.68)

Industry: Leisure & Entertainment (0.47)

Technology:

Information Technology > Artificial Intelligence > Speech (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Vision (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

ANT: Adaptive Noise Schedule for Time Series Diffusion Models

Neural Information Processing Systems | Mar-22-2026, 16:38:05 GMT

Advances in diffusion models for generative artificial intelligence have recently propagated to the time series (TS) domain, demonstrating state-of-the-art performance on various tasks. However, prior works on TS diffusion models often borrow the framework of existing works proposed in other domains without considering the characteristics of TS data, leading to suboptimal performance. In this work, we propose Adaptive Noise schedule for Time series diffusion models (ANT), which automatically predetermines proper noise schedules for given TS datasets based on their statistics representing non-stationarity. Our intuition is that an optimal noise schedule should satisfy the following desiderata: 1) it linearly reduces the non-stationarity of TS data so that all diffusion steps are equally meaningful, 2) the data is corrupted to random noise at the final step, and 3) the number of steps is sufficiently large. The proposed method is practical in that it eliminates the need to search for the optimal noise schedule, at only a small additional cost to compute the statistics for given datasets, which can be done offline before training.

artificial intelligence, machine learning, proceedings, (9 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Schedule Your Edit: A Simple yet Effective Diffusion Noise Schedule for Image Editing

Neural Information Processing Systems | Mar-22-2026, 15:01:20 GMT

Text-guided diffusion models have significantly advanced image editing, enabling high-quality and diverse modifications driven by text prompts. However, effective editing requires inverting the source image into a latent space, a process often hindered by prediction errors inherent in DDIM inversion. These errors accumulate during the diffusion process, resulting in inferior content preservation and edit fidelity, especially with conditional inputs. We address these challenges by investigating the primary contributors to error accumulation in DDIM inversion and identify the singularity problem in traditional noise schedules as a key issue. To resolve this, we introduce the Logistic Schedule, a novel noise schedule designed to eliminate singularities, improve inversion stability, and provide a better noise space for image editing. This schedule reduces noise prediction errors, enabling more faithful editing that preserves the original content of the source image. Our approach requires no additional retraining and is compatible with various existing editing methods. Experiments across eight editing tasks demonstrate the Logistic Schedule's superior performance in content preservation and edit fidelity compared to traditional noise schedules, highlighting its adaptability and effectiveness. The project page is available at https://lonelvino.github.io/SYE/.

artificial intelligence, machine learning, proceedings, (10 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.40)
